Genome-wide evolution and expression analysis of the MYB-CC gene family in Brassica spp.

The MYB-CC family is a subtype within the MYB superfamily. This family contains an MYB domain and a predicted coiled-coil (CC) domain. Several MYB-CC transcription factors are involved in the plant’s adaptability to low phosphate (Pi) stress. We identified 30, 34, and 55 MYB-CC genes in Brassica rapa, Brassica oleracea, and Brassica napus, respectively. The MYB-CC genes were divided into nine groups based on phylogenetic analysis. The analysis of the chromosome distribution and gene structure revealed that most MYB-CC genes retained the same relative position on the chromosomes and had similar gene structures during allotetraploidy. Evolutionary analysis showed that the ancestral whole-genome triplication (WGT) and the recent allopolyploidy are critical for the expansion of the MYB-CC gene family. The expression patterns of MYB-CC genes were found to be diverse in different tissues of the three Brassica species. Furthermore, the gene expression analysis under low Pi stress revealed that MYB-CC genes may be related to low Pi stress responses. These results may increase our understanding of MYB-CC gene family diversification and provide the basis for further analysis of the specific functions of MYB-CC genes in Brassica species.

INTRODUCTION Phosphorus (P) is an essential macronutrient for plant growth and development. It plays a vital role in a number of processes, including metabolic regulation, energy transfer, and protein activation (Marschner & Marschner, 2012). P is transported into the plant from the soil, which is key to maintaining P levels in plant cells . P is abundant in many soils, however, plants utilize a phosphate (Pi) form that is very scarce (Holford, 1997). Consequently, plants often encounter Pi deficiency in agricultural and natural systems' soils (Raghothama, 1999). In order to improve the absorption and usage of Pi in the soils, plants have evolved many adaptive responses to low Pi stress, known as Pi starvation response (PSR) (Raghothama, 1999;Williamson et al., 2001;Vance, Uhde-Stone & Allan, 2003). Morphologically, the architecture of the root system is altered under low Pi stress, which leads to a high root-shoot ratio and the root hair proliferation to enhance the total surface area for soil exploration (Williamson et al., 2001;Lopez-Bucio et al., 2002). Plants increase the availability of endogenous and exogenous inorganic Pi by increasing the P-replacing enzyme activity in metabolites and structural compounds and by releasing organic acids into soil solution (Duff, Sarath & Plaxton, 1994;Raghothama, 1999;Vance, Uhde-Stone & Allan, 2003). Furthermore, the root tips of some plants combine with mycorrhizal fungi to form mutually beneficial symbionts to modify soil scavenging potential (Harrison, 1999;Watt & Evans, 1999).
In Arabidopsis' PSR, a subset of genes is induced or suppressed (Wu et al., 2003;Misson et al., 2005). Pi starvation-induced (PSI) genes may be regulated by several transcription factors, of which PHOSPHATE STARVATION RESPONSE 1 (PHR1) is the key transcription factor (TF) in vascular plants (Rubio et al., 2001;Devaiah, Karthikeyan & Raghothama, 2007;Devaiah et al., 2009;Ren et al., 2012). In Arabidopsis, PHR1 with a MYB domain and a predicted coiled-coil (CC) domain belongs to a 15-member family of MYB-CC transcription factors (Rubio et al., 2001). PHR1 and its homologues are also members of GARP subfamily transcription factors as they contain a consensus sequence (SHLQ(K/M)(Y/F)) similar to the motif (SHAQK(Y/F)F) found in MYB-related proteins (Du et al., 2013;Safi et al., 2017). PHR1 may bind to the specific cis-element (PHR1-binding sequence, P1BS) with an imperfect palindromic sequence GNATATNC in the promoter of PSI genes (Rubio et al., 2001). The Arabidopsis phr1 mutant displays a decreased expression of a subset of PSI genes and modified PSR, including a reduced Pi content and the accumulation of anthocyanin (Rubio et al., 2001;Misson et al., 2005). However, the overexpression of PHR1 in transgenic Arabidopsis increased the expression of a subset of PSI genes, the Pi content, and anthocyanin accumulation (Nilsson, Muller & Nielsen, 2007). The binding of PHR1 to P1BS elements in PSI gene promoters may be modulated by the SPX domain-containing proteins, such as SPX1 and SPX2, which interact with PHR1 and its orthologs to inhibit their activity, thereby suppressing PSR (Puga et al., 2014;Wang et al., 2014;Wild et al., 2016). PHR1 and the other MYB-CC members (PHL1, PHL2, PHL3 and PHL4) are crucial components of the central regulatory system controlling Arabidopsis' transcriptional responses to Pi starvation (Bustos et al., 2010;Sun et al., 2016;Wang et al., 2018). Several PHR1 ortholog genes have been identified and characterized in different species (Valdes-Lopez et al., 2008;Zhou et al., 2008;Ren et al., 2012;Wang et al., 2013;Wang et al., 2019). The identification of PHR1 homologs and the establishment of their roles in these species have shown that central regulators have highly conserved functions. Furthermore, it suggests that members of the MYB-CC family may play a transcriptional regulatory role in controlling Pi uptake and homeostasis in plants. Therefore, it is crucial to identify MYB-CC genes and explore their regulatory roles in plant Pi uptake and homeostasis.
Brassica napus is a global essential oil crop. It is primarily cultivated for its healthy edible oil that is extracted from the seeds and its utility as a renewable biofuel, which is receiving increased attention. Vegetable types of the species have also been bred for consumption by both human and animals. B. napus is an allopolyploid (AACC, 2n = 38) resulting from natural interspecific hybridization between its two parental species: Brassica rapa (AA, 2n = 20) and Brassica oleracea (CC, 2n = 18) (Beilstein, Al-Shehbaz & Kellogg, 2006;Chalhoub et al., 2014). The genome evolution assay indicates the whole-genome triplication event in the Brassica lineage occurred about 13-17 million years ago (Ma) after it diverged from Arabidopsis thaliana about 20 Ma (Johnston et al., 2005;Town et al., 2006;Yang et al., 2006;Lysak et al., 2007;Mun et al., 2009). B. oleracea and B. rapa are also important vegetable crop species with a wide genetic and morphological diversity. B. rapa and B. oleracea differentiated from their ancestor approximately 3.75 Ma (Inaba & Nishio, 2002). The genome was doubled to form the heterotetraploid B. napus about 10,000 years ago (Cheung et al., 2009). Homologous fragment sequences in B. napus, B. rapa, and B. oleracea genomes indicate that there is a near-perfect microcollinearity between them (Cheung et al., 2009;Wang et al., 2011).
Our previous study revealed that BnPHR1 may play a crucial regulatory role in the Pi starvation response in B. napus (Ren et al., 2012). However, there is still a lack of research on other MYB-CC genes in B. napus and its parental species, B. rapa and B. oleracea. We conducted an evolutionary-based genome-wide analysis of the MYB-CC gene family in Brassica species. The gene structure, chromosome distribution, conserved motifs, phylogenetic relationship, cis-elements, and expression profiles were assayed systematically. These results may illuminate the evolutionary history of the MYB-CC gene family and assist in the investigation of the functions of MYB-CC genes in the Brassica species.

MATERIALS AND METHODS
Plant materials and growth conditions B. napus (accession Zhongshuang11) was kindly provided by the Oilcrops Research Institute, Chinese Academy of Agricultural Sciences. The seeds were disinfected in 70% ethanol for 1 min, washed with ddH 2 O three to four times, placed on a Petri dish lined with two layers of filter paper, and then germinated at 24 C in growth chamber. After 3 days, the germinated seedlings were transferred to MS medium containing 0.3% agar and 3% sucrose at pH 5.7. The seedlings grew at 24 C under a long-day light cycle (16 h light/8 h dark) and a relative humidity of 65%. Then the two-week-old plants were cultured in 1 mM and 1 mM Pi liquid medium for 24 h, respectively. Rapeseed seedlings' shoots and roots were collected and frozen in liquid nitrogen. The RNA-seq analysis was carried out by Biomarker (Beijing). A total of 16 RNA samples were sequenced using the Illumina HiSeq 2,000 platform (Illumina, USA). The transcript abundance (FPKM value) was calculated based on the length of the gene and the reads mapped gene. The heatmap was constructed using TBtools (Chen et al., 2020). The raw RNA-seq data were deposited in the NCBI SRA database (https://ncbi.nlm.nih.gov/sra/) under the accession number PRJNA739537 and the assembled transcriptome data were deposited in the NCBI GEO database (https://www.ncbi.nlm.nih.gov/geo/)under the accession number GSE192452.

Multiple sequence alignment and phylogenetic analysis of MYB-CC family
We constructed multiple sequence alignments of BnaMYB-CC, BraMYB-CC, BolMYB-CC, and AtMYB-CC using Clustal X (Thompson et al., 1997). The unrooted phylogenetic trees were constructed using the Neighbor-Joining (NJ) method. This was done using MEGA 6.0 software with the JTT model and pairwise gap deletion option (Tamura et al., 2013). The bootstrap analysis was conducted with 1,000 iterations.

Chromosomal location and gene duplication
The position of the MYB-CC genes was investigated according to the B. napus v4.1. chromosomal database (http://www.genoscope.cns.fr) and the Brassica database (BRAD, http://brassicadb.org). MapInspect software was used to locate the MYB-CC genes. The gene duplication was defined as: a coverage region between two sequences of more than 70% with more than 85% similarity in the coverage region (Zhou et al., 2004;Yang et al., 2008). Additionally, the ka (non-synonymous mutation rate) and ks (synonymous mutation rate) values of the repeating gene pairs were calculated using DnaSp software.
Gene structure and conserved motif analysis GSDS (http://gsds.gao-lab.org/) was used to determine the exon/intron distribution of BnaMYB-CCs through the comparison of CDS and genomic sequences (Guo et al., 2007). The conserved motifs of BnaMYB-CCs were identified using MEME (Bailey et al., 2006) with the flowing parameters: the maximum number of motif: 10; the optimum width: 6-250; the number of repetitions: any. InterProScan was used to annotate the motifs (Quevillon et al., 2005).

Promoter elements analysis
From BRAD, a sequence of 2,000 bp upstream of the start codon was obtained for each MYB-CC gene and submitted to the PlantCARE website to determine the distribution of cis-elements in MYB-CC promoters (http://bioinformatics.psb.ugent.be/webtools/ plantcare/html/) (Lescot et al., 2002).

Gene expression patterns in different tissues
The expression patterns of MYB-CC genes in different B. rapa, B. oleracea, and B. napus tissues were analyzed with previously published transcriptomic data Yu et al., 2014;Sun et al., 2017). RNA-seq data were collected from the GEO database using accession numbers GSE43245 and GSE42891.

RESULTS
Identification of MYB-CC genes in B. napus, B. rapa, and B. oleracea The B. napus, B. rapa, and B. oleracea genomes were searched using the protein sequences of 15 MYB-CCs in A. thaliana. A total of 55 MYB-CC genes were identified in B. napus and were equally distributed in the A n and C n subgenomes (27 genes in A n and 28 genes in C n , respectively) ( Table 1). A nomenclature system was used to distinguish the MYB-CC genes. The identified MYB-CC genes in B. napus were designated as BnaMYB-CC01 to BnaMYB-CC55 based on their order on A n 01, C n 01, A n 02, C n 02, etc.. The majority of BnaMYB-CCs contained 148 to 526 amino acids with molecular weights ranging from 16.991 kDa to 59.691 kDa. The predicted isoelectric points (pI) varied from 4.99 to 10.55 (Table 1). A total of 30 MYB-CC genes were identified in B. rapa and were designated as BraMYB-CC01 to BraMYB-CC30 based on their order on A r 01-Ann (Table S1). Furthermore, 34 MYB-CC genes were identified in B. oleracea and were assigned as BolMYB-CC01 to BolMYB-CC34 based on their order on C o 01-Cnn (Table S2).

Evolutionary analysis of MYB-CC family
To explore the evolutionary relationship of the MYB-CC gene family, an unrooted phylogenetic tree was generated using the Neiboring-Joining (NJ) method based on the full-length of 134 MYB-CC protein sequences from A. thaliana, B. napus, B. rapa, and B. oleracea. As shown in Fig. 1, the MYB-CC proteins were classified into nine distinct groups based on the branch of the phylogenetic tree. Group E and Group F each had 20 members, which were the larger groups. Group I was the smallest group with only two MYB-CC members. Group A, B, C, D, G, and H contained 19, 14, 14, 16, 13, and 14 members, respectively. One of the Arabidopsis MYB-CC members (coded by At2g01060) was not found to have a homolog in B. napus. The MYB-CC proteins in Group E were homologous to AtPHR1 and AtPHL4, and the members of Group B were homologs of AtPHL2 and AtPHL3. AtPHL1 belonged to Group F (Fig. 1).

Chromosomal distribution of MYB-CC genes
In order to determine the chromosomal distribution of MYB-CC genes, we searched the Brassica genome database and mapped MYB-CC genes to the corresponding chromosomes. The results showed that 50 BnaMYB-CC genes were distributed across 17 chromosomes except for A n 04 and C n 04 (Fig. 2). There were 27 BraMYB-CC genes located on nine chromosomes except for A r 04, and 28 BolMYB-CC genes were distributed on eight chromosomes except for C o 04 (Fig. 2). Five BnaMYB-CC genes (BnaMYB-CC51, BnaMYB-CC52, BnaMYB-CC53, BnaMYB-CC54, and BnaMYB-CC55), three BraMYB-CC genes (BraMYB-CC28, BraMYB-CC29, and BraMYB-CC30), and six BolMYB-CC genes (BolMYB-CC29, BolMYB-CC30, BolMYB-CC31, BolMYB-CC32, BolMYB-CC33, and BolMYB-CC34) were mapped onto unanchored scaffolds (http://www.genoscope.cns.fr and http://brassicadb.org). In the A n subgenome of B. napus, chromosome A n 07 carried the most (seven) MYB-CC genes. Chromosome A n 06 and A n 08 only contained one MYB-CC gene and chromosome A n 04 did not have MYB-CC genes. Comparatively, BnaMYB-CC genes were more prevalent in the C n subgenome. Except for chromosome C n 04, the other chromosomes in the C n subgenome contained two to four MYB-CC genes ( Table S3). The relative position of seven MYB-CC genes changed on the chromosomes after allotetraploidy. By evolving from diploid to allotetraploid, two BraMYB-CC genes were rearranged in the A n subgenome belonging to B. napus. Comparatively, the positions of five MYB-CC genes in the C o and C n were changed, of which three genes were on chromosome C06 (Fig. 2; Table S4). MYB-CC genes were located on several chromosomes, namely A r 03-A n 03, A r 09-A n 09, A r 10-A n 10, and C o 08-C n 08. This was a conservative process compared to the gene distribution of B. napus and its diploid ancestors (Fig. 2). The results indicated that the genetic variation took place during the evolution of the B. napus genome from its diploid progenitors. The A n subgenome was more stable than the C n subgenome, which may be due to the more abundant homologous exchanges or richer transposable elements in the C subgenome (Chalhoub et al., 2014).

Collinearity analysis and gene duplication
Collinearity analysis of the MYB-CC genes was performed on B. napus and its two parental species (B. rapa and B. oleracea). As shown in Fig. 3, the collinear MYB-CC genes were widely distributed in the two subgenomes (A n and C n ) and two genomes (A r and C o ), indicating that they promote evolution in the MYB-CC gene family. We also revealed the extension mechanism of the MYB-CC gene family in B. napus by studying gene duplication events. The results showed that 35 BnaMYB-CC genes formed 36 gene pairs with a high homology between gene and protein sequences. Some genes were involved in the duplication events more than once (Table 2). Among these segmental duplication events, 34 events happened between diverse chromosomes, and two events occurred on the same chromosome (BnaMYB-CC26 and BnaMYB-CC29 on C n 06; BnaMYB-CC32 and BnaMYB-CC35 on A n 07) (Fig. 2). Tandem duplication genes were described as a cluster of duplicated genes within 200 kb (Zhu et al., 2020). However, the two pairs of duplicated genes on the same chromosome were not closely related ( Fig. 2; Table 1), suggesting that they were not the product of tandem duplication events.
The ka (non-synonymous mutation rate) and ks (synonymous mutation rate) were calculated using the DnaSP program to better understand the effects of evolutionary constraints on BnaMYB-CC genes in B. napus. The ratios of ka and ks of two gene pairs (BnaMYB-CC02 and BnaMYB-CC06; BnaMYB-CC10 and BnaMYB-CC13) were higher than one, indicating that the evolutionary process was actively selected and these genes may function redundantly, resulting in fast evolution (Table 2). However, the remaining 34 gene pairs' ka and ks rations were less than one, indicating that most BnaMYB-CC genes were conservative and selected in the evolutionary process, resulting in a slower evolutionary rate (Table 2).

Gene structure and conserved motifs
To better understand the diversity of MYB-CC members, the gene exon/intron structure and conserved protein motifs were investigated. All BnaMYB-CC genes were interrupted by several introns, and the exon/intron distribution was complicated (Fig. 4A). BnaMYB-CC37 contained the most introns (11), while several members in group A had the fewest introns (BnaMYB-CC05, BnaMYB-CC09, and BnaMYB-CC22 with four introns each). The members belonging to the same group showed similar exon/intron arrangements (Fig. 4A). For example, the BnaMYB-CC genes in Group H had five introns with similar positions and lengths, suggesting they may have resulted from gene duplication.
Forty-five pairs of homologous genes, which have the closest genetic distance, were analyzed to further investigate the variation of gene structure after allotetraplpidy. The results indicated that the genes were clustered together at the terminal level of the phylogenetic tree (Table S5). Seven pairs of homologous genes showed intron loss/gain  (Table S5). In contrast to their ancestral genes, most BnaMYB-CC genes had an identical exon/intron phase (Fig. 4A). It should be noted that intron acquisition events mainly occurred in group C. Specifically, BnaMYB-CC01, BnaMYB-CC20, and BnaMYB-CC54 had one more intron than their homologous genes in the diploid progenitors (Table S5). The majority of the homologous pairs retained consistent number and a similar phase of exon/intron, indicating that MYB-CC gene structures was conserved. Ten conserved motifs (motifs 1-10) with lengths ranging from 15-50 amino acids were identified using the MEME program (Fig. 4B). The motifs were annotated using the InterProScan program. The results showed that the MYB-like DNA-binding domain and the predicted coiled-coil domain (motif 1 and 2) were found in all BnaMYB-CCs (Fig. 4C). Moreover, similar to the exon/intron organization, the members belonging to the same group also showed similar motif composition. Motif 6 and 7 were found uniquely in the BnaMYB-CC proteins in Group D. Only the proteins of Group A contained motif 8, while motif 10 was found mainly in Group C (Fig. 4C). The other six motifs were widely distributed among the groups. For example, motif 9 was absent in Group A and Group D (Fig. 4C) indicating that there were similarities in the conserved sequences of different phylogenetic groups.

Cis-elements in promoters of MYB-CC genes
To investigate the cis-elements in the promoters of MYB-CC genes, a 2 kb sequence upstream of the start codon of each gene was extracted from the B. napus genome. The promoter sequences were submitted to PlantCARE. As shown in Fig. 5 and Table S6, two types of cis-elements (phytohormone-related and abiotic stress-related elements) were selected. Methyl jasmonate (MeJA)-responsive element (TGACG) and ABA-responsive elements (ABRE) were present in 78.1% and 74.5% of promoters of BnaMYB-CCs, respectively. In the promoters of 55 BnaMYB-CCs, 21 contained auxin-responsive elements (TGA-element, AuxRE, and AuxRR-core), 26 contained SA responsive element (TCA element), and 27 contained gibberellin-responsive elements (GARE-motif, TATC-box, and P-box). Two cis-lements (TC-rich repeats and LTR, related to abiotic stress) were widely distributed in BnaMYB-CC promoters. BnaMYB-CC20 had seven LTRs in its promoter (Fig. 5), indicating that BnaMYB-CC20 may play a vital role in response to low-temperature stresses. The results suggest BnaMYB-CCs may be involved in stress resistance and the hormone signaling pathway. It is worth noting that the cis-elements related to light response were found in significant quantity in the promoters of BnaMYB-CC. However, our work focused on the cis-elements related to phytohormones and thus, light responsive cis-elements are not shown in Fig. 5.

Expression analysis of MYB-CC genes under low Pi stresses
The expression profiles of MYB-CCs under low Pi stress were analyzed by RNA-Seq data to identify potential MYB-CCs with regulatory roles in Pi starvation responses. The original FPKM values of the transcriptome are shown in Table S8. A heatmap was constructed to display diverse expression levels in the roots and shoots under low or high Pi conditions (Fig. 7). The expression of most MYB-CC genes was more active in the roots than that in the shoots, especially BraMYB-CCs and BolMYB-CCs (Fig. 7). The expression patterns of several genes were altered after allopolyploidy. For example, BraMYB-CC24, BolMYB-CC16, and BolMYB-CC31 were mainly expressed in the roots, while their orthologous genes BnaMYB-CC44, BnaMYB-CC26, and BnaMYB-CC23 were expressed dramatically in the shoots (Fig. 7). However, other genes retained the same expression pattern after allopolyploidy, implying that the BnaMYB-CC genes were conserved and maintained similar functions in diploid progenitors ( Fig. 7; Table S8). Approximately half of the BnaMYB-CCs were induced by low Pi stress in both roots and shoots (Fig. 7). Notably, some BnaMYB-CCs were induced twofold or more in roots (BnaMYB-CC01, BnaMYB-CC32, and BnaMYB-CC44) or shoots (BnaMYB-CC04, BnaMYB-CC10, BnaMYB-CC16, and BnaMYB-CC18) under low Pi conditions (Table S8). In B. rapa and B. oleracea, approximately one-third of BraMYB-CCs and two-thirds of BolMYB-CCs genes were up-regulated under low Pi stress in roots ( Fig. 7; Table S8). In B. napus, BnaMYB-CCs of Group D (BnaMYB-CC12, BnaMYB-CC15, BnaMYB-CC48, BnaMYB-CC50, BnaMYB-CC51, and BnaMYB-CC53) were significantly down-regulated in both roots and shoots under low Pi stress ( Fig. 7C; Table S8). Seven BnaMYB-CCs of Group A were moderately induced by low Pi stress in roots, while their expression levels increased in shoots under high and low Pi conditions ( Fig. 7; Table S8). However, BnaMYB-CC10, BnaMYB-CC13, BnaMYB-CC27, and BnaMYB-CC34 of Group H were induced by low Pi stress in roots prominently. All six members of BnaMYB-CCs in Group H were induced by low Pi stress in shoots ( Fig. 7; Table S8). The results indicate that the expression patterns were conservative for MYB-CCs of the same evolutionary group.

DISCUSSION
The MYB-CC family is a subtype within the MYB superfamily. It is contains the MYB domain and the predicted coiled-coil (CC) domain. The MYB-CC family has 15 members in A. thaliana (Rubio et al., 2001). PHR1, PHL1, PHL2, and PHL3 have been redundantly involved in regulating Pi starvation responses (Rubio et al., 2001;Bustos et al., 2010;Sun et al., 2016). PHL4 is one of the MYB-CC in Arabidopsis, and was regarded as an unimportant regulator in responses to Pi starvation (Wang et al., 2018). MYB-CCs have also been shown to be involved in plant growth regulation. One member of the MYB-CC Group A, APL, plays a dual role in inhibiting xylem differentiation and promoting phloem differentiation during vascular development (Bonke et al., 2003). MYR1 and MYR2 of MYB-CC Group C are redundant negative flowering time regulators under low light induction . GFR of MYB-CC Group D is a low-temperature regulator of flavonoid accumulation (Petridis et al., 2016). The MYB-CC family was also mentioned in other plant species, such as 12 ZmMYB-CCs in maize (Bai et al., 2019) and 13 FvMYB-CCs in woodland strawberry . MYB-CC genes in wheat have been reported in response to drought stress (Li et al., 2019c). Additionally, BnPHR1, which is a vital regulator of low Pi responses, was isolated and identified in B. napus (Ren et al., 2012). The comprehensive identification and evolutionary analysis of the MYB-CC family in B. napus and its parental species B. rapa and B. oleracea have not been reported to date. We comprehensively analyzed the MYB-CC family in Brassica species and assayed their expression patterns under low Pi stress. Gene duplication is the main driving force for gene family expansion (Taylor & Raes, 2004). Brassica species shared multiple paleo-polyploidy (whole-genome duplication) events with A. thaliana, providing primitive genetic material for biological evolution and facilitating the adaption to the changing environments (Bowers et al., 2003). Compared to the 15 MYB-CC genes in Arabidopsis, the number of MYB-CC genes in B. napus (55), B. rapa (30), and B. oleracea (34) increased significantly. This is due to the additional whole-genome triplication (WGT) event in the Brassica species after its differentiation from A. thaliana (Lysak et al., 2005;Town et al., 2006;Wang et al., 2011). B. napus is a heterotetraploid formed by hybridization between B. rapa and B. oleracea, followed by chromosome doubling (Cheung et al., 2009;Chalhoub et al., 2014;Wu et al., 2019). The segmental and tandem duplication events contribute to the gene family expansion during evolution (Cannon et al., 2004). We found that 35 BnaMYB-CC genes formed 36 pairs with high gene and protein sequence identity and similarity. All of these gene pairs were identified as segmental repeats, indicating that the amplification of the MYB-CC gene family in B. napus may be independent of tandem duplication. The same duplication pattern was reported in other gene families during the evolution from diploid to allotetraploid (Li et al., 2019a;Li et al., 2019b;Wang et al., 2020), and the gene families were mainly influenced by segmental duplication resulting from WGT and allopolyploidy (Li et al., 2017). Specifically, plants retain many duplicate chromosome blocks in their genomes due to polyploidization events, resulting in segmental duplication (Cannon et al., 2004). Tandem repeats are the result of unequal crossing-over between similar alleles (Achaz et al., 2000). Although no tandem duplication was found in this study, it is still an important mechanism for the expansion of gene family in plants. For instance, tandemly arrayed genes comprise more than 10% of the genes in A. thaliana genome (Rizzon, Ponger & Gaut, 2006).
Approximately 45 MYB-CC genes were predicted to exist in both B. rapa and B. oleracea as a result of the additional WGT in Brassica species. However, only 30 and 34 MYB-CC genes occurred in two diploid species and large-scale gene loss after WGT, respectively (Cheng, Wu & Wang, 2014). The B. rapa genome contains approximately twice the number of genes of A. thaliana due to genomic shrinkage and differential loss of duplicated genes after WGT events (Mun et al., 2009). It has been previously reported that 35% of genes in Brassica were lost by a deletion mechanism when WGT occurred (Town et al., 2006). The concentration of some gene products were changed after WGT, which may lead to an imbalance of gene dose. The relatively low retention frequency of these dose-changed genes may also explain the genetic loss after WGT (Freeling, 2008). The number of MYB-CC genes in B. napus was found to be less than the sum of its two diploid ancestors, indicating gene loss also happened after allopolyploidization. The 13 deleted genes were listed in Table S4, of which five genes belong to the A subgenome and eight belong to the C subgenome. This may be due to the rearrangement of genome sequences after hybridization (Paterson, Bowers & Chapman, 2004). In general, each of MYB-CCs in Arabidopsis was expected to have six homologs in B. napus. However, 12 MYB-CCs of Arabidopsis had less than six homologous genes in B. napus. For example, MYR1 had only two homologs (BnaMYB-CC46 and BnaMYB-CC49), which may be caused by the loss of unnecessary duplicates during evolution. The retained MYB-CC gene replications may play a non-redundant role in B. napus.
MYB-CC transcription factors are mainly involved in the responses to Pi starvation (Rubio et al., 2001;Nilsson, Muller & Nielsen, 2007;Zhou et al., 2008;Valdes-Lopez et al., 2008;Bustos et al., 2010;Ren et al., 2012;Wang et al., 2013;Sun et al., 2016;Ruan et al., 2017;Wang et al., 2018;Wang et al., 2019). Data on MYB-CC genes in B. napus are limited; however, the homologous AtMYB-CCs may predict the functions of BnaMYB-CCs in the same phylogenetic group. In Arabidopsis, PHL2 is modulated by Pi starvation and is redundant with PHR1 in regulating the responses to Pi starvation (Sun et al., 2016). The homologs of PHL2 in B. napus (BnaMYB-CC03, BnaMYB-CC07, BnaMYB-CC17, and BnaMYB-CC19) were only slightly induced in shoots under low Pi conditions ( Fig.7; Table S8), suggesting that the functions of the genes might have changed. The transcription of AtPHR1 was not significantly regulated by Pi starvation (Rubio et al., 2001) and its homologous genes (BnaMYB-CC02, BnaMYB-CC06, BnaMYB-CC39, BnaMYB-CC40, and BnaMYB-CC42) in B. napus were not induced by low Pi stresses either ( Fig.7; Table S8). The homologous genes of AtPHR1 were also inactive at the transcriptional level under low Pi stress in other plant species (Zhou et al., 2008;Wang et al., 2019). Our results support that post-translational modification mainly regulates the activity of PHR1 (Miura et al., 2005). To date, the functions of the majority of MYB-CC genes in Arabidopsis and in other plant species are still unclear. Under low Pi stress, MYB-CC genes are mainly involved in anthocyanin biosynthesis and accumulation, Pi redistribution and homeostasis in plant shoots, and involved in modification of root system architecture and Pi uptake in plant roots. The differentiation of the expression patterns of MYB-CC genes from diploid to allotetraploid suggests that their role assignments in above mentioned processes could be changed. However, the functions of each MYB-CC gene in plant Pi homeostasis and low Pi response need to be further studied.
To further characterize the MYB-CC genes in responses to Pi starvation evolutionarily, we selected 38 pairs of potential orthologous genes for expression analysis. The expression levels of BnaMYB-CCs were significantly lower than in diploid species (Table S9). Some gene pairs with the same location had different expression patterns, whereas others with different location shared similar expression patterns (Table S9). There may be no direct correlation between the relative position of the gene and its expression patterns.

CONCLUSIONS
This study comprehensively evaluated the identification, classification, expression and evolution analyses of MYB-CC gene family of Brassica species. A total of 30, 34, and 55 MYB-CC genes were identified in B. rapa, B. oleracea, and B. napus, respectively. All of the MYB-CC genes were divided into nine groups in the phylogenetic tree. Members of the same group have similar gene structures and conserved motifs, indicating that they could be conserved during evolution. In Brassica, WGT and allotetraploidy were vital for the expansion of MYB-CC genes. Gene loss occurred widely for a number of reasons during the evolutionary process. An analysis of gene expression under low Pi stress showed that there was no significant relationship between the relative positions of MYB-CC genes and their expression patterns. This work will promote the understanding of the MYB-CC gene family and assist further analysis of the specific functions of MYB-CC genes in the Brassica species.

ADDITIONAL INFORMATION AND DECLARATIONS Funding
The National Natural Sciences Foundation of China (Grant No. 31971814 and 31571572) and the Project of National Research and Development of China (Grant No. 2016YFD0100202) funded this research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.